iot unit
TIFeD: a Tiny Integer-based Federated learning algorithm with Direct feedback alignment
Colombo, Luca, Falcetta, Alessandro, Roveri, Manuel
Training machine and deep learning models directly on extremely resource-constrained devices is the next challenge in the field of tiny machine learning. The related literature in this field is very limited, since most of the solutions focus only on on-device inference or model adaptation through online learning, leaving the training to be carried out on external Cloud services. An interesting technological perspective is to exploit Federated Learning (FL), which allows multiple devices to collaboratively train a shared model in a distributed way. However, the main drawback of state-of-the-art FL algorithms is that they are not suitable for running on tiny devices. For the first time in the literature, in this paper we introduce TIFeD, a Tiny Integer-based Federated learning algorithm with Direct Feedback Alignment (DFA) entirely implemented by using an integer-only arithmetic and being specifically designed to operate on devices with limited resources in terms of memory, computation and energy. Besides the traditional full-network operating modality, in which each device of the FL setting trains the entire neural network on its own local data, we propose an innovative single-layer TIFeD implementation, which enables each device to train only a portion of the neural network model and opens the door to a new way of distributing the learning procedure across multiple devices. The experimental results show the feasibility and effectiveness of the proposed solution. The proposed TIFeD algorithm, with its full-network and single-layer implementations, is made available to the scientific community as a public repository.
Distributed Deep Convolutional Neural Networks for the Internet-of-Things
Disabato, Simone, Roveri, Manuel, Alippi, Cesare
Due to the high demand in computation and memory, deep learning solutions are mostly restricted to high-performance computing units, e.g., those present in servers, Cloud, and computing centers. In pervasive systems, e.g., those involving Internet-of-Things (IoT) technological solutions, this would require the transmission of acquired data from IoT sensors to the computing platform and wait for its output. This solution might become infeasible when remote connectivity is either unavailable or limited in bandwidth. Moreover, it introduces uncertainty in the "data production to decision making"-latency, which, in turn, might impair control loop stability if the response should be used to drive IoT actuators. In order to support a real-time recall phase directly at the IoT level, deep learning solutions must be completely rethought having in mind the constraints on memory and computation characterizing IoT units. In this paper we focus on Convolutional Neural Networks (CNNs), a specific deep learning solution for image and video classification, and introduce a methodology aiming at distributing their computation onto the units of the IoT system. We formalize such a methodology as an optimization problem where the latency between the data-gathering phase and the subsequent decision-making one is minimized. The methodology supports multiple IoT sources of data as well as multiple CNNs in execution on the same IoT system, making it a general-purpose distributed computing platform for CNN-based applications demanding autonomy, low decision-latency, and high Quality-of-Service.
- Europe > Italy > Lombardy > Milan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Switzerland (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Information Technology > Smart Houses & Appliances (1.00)
- Education (1.00)